@simoll simoll commented Sep 3, 2025

This is based on the simoll/ser_exectest_patch branch (#7379) for easier diffing, which was the last ingested SER execution test patch.
Actual changes are in the last two commits:

1. [SER] Execution test update

Issue: https://github.com/microsoft/hlsl-specs/issues/613

Changes:
- Added permutation testing for getters and intersection attributes.
  Combinations of:
  * Producing HitObjects (HitObject::TraceRay, HitObject::FromRayQuery,
    HitObject::MakeMiss)
  * Querying property from HitObject (Direct HitObject getter or
    HitObject::Invoke+classic getter)
  * HitObject tested in raygen, closesthit, or miss (adds recursion)
  * with and without reorder

- Fused all Scalar/Vector/MatrixGetter tests into one SERGetterPermutationTest
- Added SERAttributesPermutationTest: Testing procedural and triangle
  intersection attributes

- Added SERNOPValuesTest: Explicit test for default return values in NOP
  HitObjects.

- Added SERMultiPayloadTest: Testing multiple live HitObjects with
  differing payload types, SBT indices and loop control flow.
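
As a rough illustration of the permutation axes listed above, the following is a minimal sketch of how the cross product could be enumerated. The enum and struct names are hypothetical and not taken from ExecutionTest_SER.h, and the actual tests may skip combinations that are not meaningful.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical enums mirroring the four axes from the change list.
// None of these names come from the actual test sources.
enum class Producer { TraceRay, FromRayQuery, MakeMiss };
enum class QueryMode { DirectGetter, InvokeThenClassicGetter };
enum class TestStage { RayGen, ClosestHit, Miss };

struct Permutation {
  Producer Prod;
  QueryMode Query;
  TestStage Stage;
  bool Reorder;
};

// Builds the full cross product:
// 3 producers x 2 query modes x 3 stages x 2 reorder settings.
std::vector<Permutation> enumeratePermutations() {
  std::vector<Permutation> Result;
  for (Producer P :
       {Producer::TraceRay, Producer::FromRayQuery, Producer::MakeMiss})
    for (QueryMode Q :
         {QueryMode::DirectGetter, QueryMode::InvokeThenClassicGetter})
      for (TestStage S :
           {TestStage::RayGen, TestStage::ClosestHit, TestStage::Miss})
        for (bool R : {false, true})
          Result.push_back({P, Q, S, R});
  return Result;
}
```

Under these assumptions the enumeration yields 36 entries; each entry would then select the shader defines for one test run.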

2. [SER] HitObject::GetAttributes change (off by default)

Issue: https://github.com/microsoft/hlsl-specs/issues/612

Adds code paths with new HitObject::GetAttributes API (guarded by #if NEW_GETATTRIBUTES_API to help transition).
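
One way such a guard could be toggled from the test side is by passing a macro definition to the compiler, similar to how the existing tests push `-DENABLE_PAQS=1`. The sketch below is an assumption for illustration only: `buildCompileArgs` is a hypothetical helper, and whether the tests actually set the macro through compile arguments is not stated in this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of the test sources: returns the extra
// compile arguments for a shader. The new GetAttributes code path stays
// off unless explicitly requested, matching "off by default".
std::vector<std::wstring> buildCompileArgs(bool UseNewGetAttributes) {
  std::vector<std::wstring> Args;
  if (UseNewGetAttributes)
    Args.push_back(L"-DNEW_GETATTRIBUTES_API=1");
  return Args;
}
```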

Greg Roth and others added 30 commits March 31, 2025 18:25
This change adds vector and multi-dimensional overload support for DXIL operations.

Incorporates change to add vector overloads from @pow2clk. This includes changes to hctdb*.py, DxilOperations.*, and DxilValidation.cpp.
hctdb.py:
    Updates vector character from 't' to '<'
    Allows legal vector element overloads to be specified after vector overload character.
    Defaults legal vector element overloads to the scalar element overloads, if any.
    Adds 'x' for extended overload mechanism, which supports up to 2 overload dimensions at this point, but is easily expandable if necessary.
    Processes syntax using ',' to separate multiple overloads and defaulting vector element overloads into new breakdown of main overload string, vector overload list of strings, and list of extended overloads (used if main overload set to 'x').
DxilOperations:
    Extend OpCodeProperty with ExtendedOverloads and AllowedVectorElements arrays using new OverloadMask.
    When extended bit set in main overloads, ExtendedOverloads array is used for each overload dimension
    When vector bit set in main overloads or each ExtendedOverloads dimension, AllowedVectorElements are set for corresponding dimension index for allowed element types.
    Updated generated DXIL op table
    Remove unused static methods in hlsl::OP
     I think these were leftover from an attempt to work around a name collision
     ultimately caused by dx.types.CBufRet.f16|i16 between min-precision (with 4 elements)
     and native low precision (with 8 elements), caused by failing to initialize the
     min precision mode correctly for linking.
     That issue was fixed, and the names made unique by adding ".8" to the end of
     the 8-element native low precision cbuffer return type.
     eliminate use of std::string, std::vector
      - Use Twine, raw_svector_ostream, and SmallVector storage to replace uses of std::string
      - Use SmallVector instead of std::vector for ArgTypes in GetOpFunc

Rework DXIL op overload system

Add comments explaining the new system.
Eliminate bool array in favor of array of masks for up to N dimensions.
Add NumOverloadDims instead of two-mode system.
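
To make the mask-per-dimension idea concrete, here is a minimal sketch under stated assumptions. The type slot names, bit layout, and `OpOverloads` struct are hypothetical and do not reproduce the actual OpCodeProperty layout in DxilOperations.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical type slots; the real enum and bit assignments may differ.
enum TypeSlot : unsigned { SlotF16 = 0, SlotF32 = 1, SlotI32 = 2 };

constexpr unsigned MaxOverloadDims = 2;

struct OpOverloads {
  unsigned NumOverloadDims;                    // 0: op is not overloaded
  std::array<uint16_t, MaxOverloadDims> Masks; // one bit mask per dimension
};

constexpr uint16_t bit(TypeSlot S) { return uint16_t(1u << S); }

// True if type slot S is a legal overload for dimension Dim of the op.
bool isAllowed(const OpOverloads &Op, unsigned Dim, TypeSlot S) {
  if (Dim >= Op.NumOverloadDims)
    return false;
  return (Op.Masks[Dim] & bit(S)) != 0;
}
```

In this sketch an op with `NumOverloadDims == 0` rejects every query, which reflects the note below that void does not need a type slot.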

Rework TypeSlots:
- use enum, categorize basic, limit masks to used bits
- void doesn't need a type slot (NumOverloadDims == 0 instead)
- m_OverloadTypeName only contains basic type names

Handle multi-overload in FixOverloadNames; new MayHaveNonCanonicalOverload
is used to determine whether the overload name could need fixing.

Extended overload is still a distinction because of the way the overloads
must be wrapped in an unnamed StructType.
However, it does not need a bit in the overload mask.

Renamed GetVectorType to GetStructVectorType, since it's just used to get a
struct for a particular vector type, not a vector type itself.

In hctdb.py, no longer separate extended and vector overloads, just verify correctness
of the incoming string, and add default vector overloads if necessary.

In hctdb_instrhelp.py, update according to changes in hctdb.py, and eliminate
needless, problematic, outdated comment printing.
Enable native vector DXIL intrinsic overload for vector load/store

Add a new native vector overload type to DXIL intrinsics and the corresponding generation.
Add new raw buffer vector load/store intrinsics that use that overload type.

Generate native vector raw buffers load/stores

When the loaded/stored type is a vector of more than 1 element, the
shader model is 6.9 or higher, and the operation is on a raw buffer,
enable the generation of a native vector raw buffer load or store.
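
The three conditions above can be read as a single predicate. The following is only a sketch of that decision, with hypothetical names (`ShaderModel`, `useNativeVectorRawBufferAccess`); it is not the lowering code from HLOperationLower.cpp.

```cpp
#include <cassert>

// Hypothetical stand-in for the shader model version.
struct ShaderModel {
  unsigned Major;
  unsigned Minor;
};

bool isAtLeast(ShaderModel SM, unsigned Major, unsigned Minor) {
  return SM.Major > Major || (SM.Major == Major && SM.Minor >= Minor);
}

// Native vector raw buffer load/store is chosen only when all three
// conditions from the commit message hold.
bool useNativeVectorRawBufferAccess(unsigned NumElements, ShaderModel SM,
                                    bool IsRawBuffer) {
  return NumElements > 1 && isAtLeast(SM, 6, 9) && IsRawBuffer;
}
```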

Add validation of vector load stores
actually allow the given ops to take vectors
add vector overload type and apply to the relevant builtins

Build lowering functions to allow vector supporting intrinsics through

Preliminary groupshared support. Keep groupshared as vectors for 6.9. They are no longer represented as individual groupshared scalars.

adds groupshared to the test and performs the switch to CS to allow it.

Support dot product on long vecs by expanding the intrinsic into
mul/mad ops, as is done with integer dot products
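
As a scalar model of that expansion, the sketch below computes a dot product as one multiply followed by a chain of multiply-adds. It runs on the CPU for illustration only; the function name is hypothetical, and whether the generated mad ops are fused is not something this sketch asserts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product written as mul + mad chain: the first element pair is a
// plain multiply, every following pair is accumulated with a multiply-add.
// Returns 0 for empty or mismatched inputs (a choice made for this sketch).
float dotAsMulMad(const std::vector<float> &A, const std::vector<float> &B) {
  if (A.empty() || A.size() != B.size())
    return 0.0f;
  float Acc = A[0] * B[0]; // mul
  for (std::size_t I = 1; I < A.size(); ++I)
    Acc = A[I] * B[I] + Acc; // mad
  return Acc;
}
```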

Since the or() and and() intrinsics did their own scalarization, the or/and operators would never be applied to full vectors. This leaves the scalarization for the scalarization pass, which will skip it for 6.9
…soft#7309)

This PR modifies WaveSizeRange test which depends on shader model 6.8.
The compiler needs -select-validator internal.
This will allow the tests to be run in different testing environments
when an external validator that isn't sufficient is available.
The name matches for azure pipelines to run excluded staging branches.
This adds them in.

Not sure this is desired, but it's here if we want it.
Conflicts:

        both modified:   include/dxc/DXIL/DxilConstants.h
        both modified:   include/dxc/DXIL/DxilOperations.h
        both modified:   lib/DXIL/DxilOperations.cpp
        both modified:   lib/DxilValidation/DxilValidation.cpp
        both modified:   lib/HLSL/DxilLinker.cpp
        both added:      lib/HLSL/DxilScalarizeVectorLoadStores.cpp
        both modified:   lib/HLSL/HLOperationLower.cpp
        both modified:   tools/clang/lib/Sema/SemaHLSL.cpp
        both added:      tools/clang/test/CodeGenDXIL/hlsl/intrinsics/buffer-load-stores-sm69.hlsl
        both added:      tools/clang/test/CodeGenDXIL/hlsl/types/longvec-operators-cs.hlsl
        both added:      tools/clang/test/DXILValidation/vector-validation.hlsl
        both modified:   utils/hct/hctdb.py
        both modified:   utils/hct/hctdb_instrhelp.py
Merge latest main into staging-sm6.9.
Update staging-sm6.9 with latest main.
This merges main into staging-sm6.9.  There were no conflicts.
…ics (microsoft#7290)

Implements
HLSL:
__builtin_MatVecMul
__builtin_MatVecMulAdd
__builtin_OuterProductAccumulate
__builtin_VectorAccumulate

Lowered to
DXIL:
@dx.op.matVecMul
@dx.op.matVecMulAdd
 @dx.op.outerProductAccumulate
 @dx.op.vectorAccumulate

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Damyan Pepper <[email protected]>
Co-authored-by: Simon Moll <[email protected]>
Co-authored-by: Tex Riddell <[email protected]>
Co-authored-by: Chris B <[email protected]>
…crosoft#7366)

DXC seems to be building incorrectly with GCC-13 and later, which is
causing our pre-merge testing on 24.04 to fail. This will take some time
to sort out, so in the meantime I'm reverting to 22.04 on our pipelines.

(cherry picked from commit b4a3076)

Co-authored-by: Chris B <[email protected]>
)

This PR is a basic implementation of the priority long vector execution
tests microsoft#7260.
Merge main into staging-sm6.9
…rosoft#7375)

Moving the long vector test utility functionality to its own file to
help make subsequent reviews easier. No new code or logic updates.
- All trivial scalar/vector/matrix getters
- HitObject::FromRayQuery with procedural hit
- HitObject::GetAttributes<T> with custom attributes and procedural hit

SER implementation tracker: microsoft#7214
This PR introduces the linear algebra header file, and places it in a
location that is by default included in all HLSL compilation.
The builtins in the API aren't yet defined, and depend on the microsoft#7290 PR
merging first.
The tests that have been added have temporary diagnostic messages while
7290 is in progress. They will need to be updated.
Open to feedback on better / suggested error messages, or whether there
shouldn't be any sema-level validation for these errors.

Fixes
[microsoft#7304](microsoft#7304)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Merge was clean apart from one conflict in HLOperations.h where
HitObject::FromRayQuery wasn't in the staging version and got added.
- Array of 8 HitObjects, each with different ray flags
- Samples at 'random' positions to block SROA from breaking down the
  array
simoll and others added 12 commits May 6, 2025 10:50
(procedural)

- Support for procedural geometry and triangles at the same time
- fix: make IS non-optional
- Made payload/attributeCount in RunDXRTest non-defaulting
- Use a circle for procedural geometry and make sure it fits into the
  AABB
…rProductAccumulate (microsoft#7424)

ExecutionTest::CoopVec_MulAdd:

Functional verification for the Mul() and MulAdd() HLSL APIs. The driver
matrix conversion API is tested as well. These tests should be considered
work-in-progress at this point. They include coverage primarily for SINT8,
FLOAT16, FLOAT_E4M3, and FLOAT_E5M2. The test queries the driver for all
supported configurations and runs each one, with a filtering mechanism to
limit the set of tests to the minimal feature set. The set of tests can be
further filtered by the following TE parameters:

  CoopVecMatrixInterp: SINT8, FLOAT16, FLOAT_E4M3, ...
  CoopVecMatrixLayout: ROW_MAJOR, COLUMN_MAJOR, MUL_OPTIMAL, OUTER_PRODUCT_OPTIMAL
  CoopVecBiasInterp: SINT32, FLOAT16, FLOAT_E4M3, ...
  CoopVecInputInterp: SINT8, FLOAT16, FLOAT_E4M3, ...
  CoopVecInputType: SINT8, UINT8, SINT16, UINT16, SINT32, UINT32, FLOAT16, FLOAT32, ...
  CoopVecOutputType: SINT32, UINT32, FLOAT16, FLOAT32, ...

Filter example:
$ TE.exe ... -p:CoopVecMatrixInterp=FLOAT16 -p:CoopVecMatrixLayout=MUL_OPTIMAL

Precision coverage is minimal at this point, using an all-ones input
matrix and test vector with ones in the first two components. This is
enough to test basic functionality, but more comprehensive tests are needed.

ExecutionTest::CoopVec_OuterProduct:

Functional verification for the OuterProductAccumulate() HLSL API. This
test queries the driver for all supported configurations and runs each one.
No filtering is currently implemented.

github-actions bot commented Sep 3, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 4e0f5364a3692f4122de0874ebb0f5550a27c867 4df10b9bac57b5375f01bca0a181e97ba933ffa2 -- tools/clang/unittests/HLSLExec/CoopVec.h tools/clang/unittests/HLSLExec/CoopVecAPI.h tools/clang/unittests/HLSLExec/DXRUtil.h tools/clang/unittests/HLSLExec/ExecutionTest_SER.h tools/clang/unittests/HLSLExec/LongVectors.h include/dxc/Test/HlslTestUtils.h include/dxc/Test/WEXAdapter.h lib/HLSL/DxilLinker.cpp tools/clang/lib/Sema/SemaHLSL.cpp tools/clang/unittests/HLSLExec/ExecutionTest.cpp tools/clang/unittests/HLSLExec/ShaderOpTest.cpp
View the diff from clang-format here.
diff --git a/tools/clang/unittests/HLSLExec/ExecutionTest.cpp b/tools/clang/unittests/HLSLExec/ExecutionTest.cpp
index 16ccf9f1..7e21efb7 100644
--- a/tools/clang/unittests/HLSLExec/ExecutionTest.cpp
+++ b/tools/clang/unittests/HLSLExec/ExecutionTest.cpp
@@ -2140,10 +2140,10 @@ public:
     int NumMissShaders = 1;
     int NumHitGroups = 1;
   };
-  CComPtr<ID3D12Resource>
-  RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc, LPCWSTR TargetProfile,
-             LPCWSTR *Options, int NumOptions, std::vector<int> &TestData,
-             const DXRRunConfig &Config);
+  CComPtr<ID3D12Resource> RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc,
+                                     LPCWSTR TargetProfile, LPCWSTR *Options,
+                                     int NumOptions, std::vector<int> &TestData,
+                                     const DXRRunConfig &Config);
 
   CComPtr<ID3D12Resource> RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc,
                                      LPCWSTR TargetProfile, LPCWSTR *Options,
@@ -2528,12 +2528,14 @@ ExecutionTest::RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc,
     std::wstring AnyHit;
     std::wstring Intersection;
     std::wstring HitGroupName;
-    const bool IsProcedural() const { return !Intersection.empty();}
+    const bool IsProcedural() const { return !Intersection.empty(); }
   };
   std::vector<HitGroupDesc> HitGroupDescs;
 
-  const bool PrimaryHitGroupsAreAABB = !Config.UseMesh && Config.UseProceduralGeometry;
-  const bool EnableSecondaryHitGroups = Config.UseMesh && Config.UseProceduralGeometry;
+  const bool PrimaryHitGroupsAreAABB =
+      !Config.UseMesh && Config.UseProceduralGeometry;
+  const bool EnableSecondaryHitGroups =
+      Config.UseMesh && Config.UseProceduralGeometry;
 
   // Base hit group
   HitGroupDesc PrimaryHitGroup{L"closesthit", L"anyhit", L"", L"HitGroup"};
@@ -2589,7 +2591,8 @@ ExecutionTest::RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc,
     Lib->DefineExport(Export.c_str());
 
   StateObjectDesc.CreateSubobject<CD3DX12_RAYTRACING_SHADER_CONFIG_SUBOBJECT>()
-      ->Config(Config.PayloadCount * sizeof(float), Config.AttributeCount * sizeof(float));
+      ->Config(Config.PayloadCount * sizeof(float),
+               Config.AttributeCount * sizeof(float));
   StateObjectDesc
       .CreateSubobject<CD3DX12_RAYTRACING_PIPELINE_CONFIG_SUBOBJECT>()
       ->Config(Config.MaxRecursion);
@@ -2630,13 +2633,12 @@ ExecutionTest::RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc,
   VERIFY_SUCCEEDED(StateObject->QueryInterface(&StateObjectProperties));
 
   // Create SBT
-  ShaderTable ShaderTable(
-      Device,
-      1,                                        // raygen count
-      Config.NumMissShaders,                    // miss count
-      (int) HitGroupDescs.size(),               // hit group count
-      1,                                        // ray type count
-      4                                         // dwords per root table
+  ShaderTable ShaderTable(Device,
+                          1,                         // raygen count
+                          Config.NumMissShaders,     // miss count
+                          (int)HitGroupDescs.size(), // hit group count
+                          1,                         // ray type count
+                          4                          // dwords per root table
   );
 
   int LocalRootConsts[4] = {12, 34, 56, 78};
@@ -2666,10 +2668,10 @@ ExecutionTest::RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc,
   // hit groups
   for (int HitGroupIdx = 0; HitGroupIdx < HitGroupDescs.size(); HitGroupIdx++) {
     const HitGroupDesc &HitGroupDesc = HitGroupDescs[HitGroupIdx];
-    memcpy(
-        ShaderTable.GetHitGroupShaderIdPtr(HitGroupIdx, 0),
-        StateObjectProperties->GetShaderIdentifier(HitGroupDesc.HitGroupName.c_str()),
-        SHADER_ID_SIZE_IN_BYTES);
+    memcpy(ShaderTable.GetHitGroupShaderIdPtr(HitGroupIdx, 0),
+           StateObjectProperties->GetShaderIdentifier(
+               HitGroupDesc.HitGroupName.c_str()),
+           SHADER_ID_SIZE_IN_BYTES);
     memcpy(ShaderTable.GetHitGroupRootTablePtr(HitGroupIdx, 0), LocalRootConsts,
            sizeof(LocalRootConsts));
   }
@@ -2916,7 +2918,8 @@ ExecutionTest::RunDXRTest(ID3D12Device *Device0, LPCSTR ShaderSrc,
   {
     D3D12_RAYTRACING_INSTANCE_DESC CPUInstanceDescs[2] = {};
     const int MeshIdx = 0;
-    const int ProcGeoIdx = Config.UseMesh && Config.UseProceduralGeometry ? 1 : 0;
+    const int ProcGeoIdx =
+        Config.UseMesh && Config.UseProceduralGeometry ? 1 : 0;
     const int NumInstanceDescs = ProcGeoIdx + 1;
 
     for (int i = 0; i < NumInstanceDescs; ++i) {
diff --git a/tools/clang/unittests/HLSLExec/ExecutionTest_SER.h b/tools/clang/unittests/HLSLExec/ExecutionTest_SER.h
index 553a913f..8ab2c066 100644
--- a/tools/clang/unittests/HLSLExec/ExecutionTest_SER.h
+++ b/tools/clang/unittests/HLSLExec/ExecutionTest_SER.h
@@ -639,11 +639,12 @@ void ahAABB(inout PerRayData payload, in CustomAttrs attrs)
 
 )";
 
-template<typename T>
-static void VerifyTestArray(const T* RefData, const T* TestData, int NumElements);
+template <typename T>
+static void VerifyTestArray(const T *RefData, const T *TestData,
+                            int NumElements);
 
-template<>
-void VerifyTestArray(const int* RefData, const int* TestData, int NumElements) {
+template <>
+void VerifyTestArray(const int *RefData, const int *TestData, int NumElements) {
   for (int i = 0; i < NumElements; i++) {
     if (RefData[i] != TestData[i]) {
       VERIFY_ARE_EQUAL(RefData[i], TestData[i]);
@@ -651,8 +652,9 @@ void VerifyTestArray(const int* RefData, const int* TestData, int NumElements) {
   }
 }
 
-template<>
-void VerifyTestArray(const float* RefData, const float* TestData, int NumElements) {
+template <>
+void VerifyTestArray(const float *RefData, const float *TestData,
+                     int NumElements) {
   for (int i = 0; i < NumElements; i++) {
     const float RefVal = RefData[i];
     const float TestVal = TestData[i];
@@ -1355,8 +1357,8 @@ void intersection1()
     bool EnableRecursion;
 
     void addCompileArgs(std::vector<std::wstring> &OwnedArgs,
-                      std::vector<LPCWSTR> &ArgVec) const {
-      (void)  OwnedArgs;
+                        std::vector<LPCWSTR> &ArgVec) const {
+      (void)OwnedArgs;
       if (EnablePAQs) {
         ArgVec.push_back(L"-DENABLE_PAQS=1");
       } else {